Improved Classification of Known and Unknown Network Traffic Flows Using Semi-supervised Machine Learning
نویسندگان
چکیده
Modern network traffic classification approaches apply machine learning techniques to statistical flow properties, allowing accurate classification even when traditional approaches fail. We base our approach to the task on a state-of-the-art semi-supervised classifier to identify known and unknown flows with little labelled training data. We propose a new algorithm for mapping clusters to classes to target classes that were previously difficult to classify. We also apply alternative statistical features. We find our approach has an accuracy of 95.10%, over 17% above the technique on which it is based. Additionally, our approach improves the classification performance on every class.
منابع مشابه
Classification of encrypted traffic for applications based on statistical features
Traffic classification plays an important role in many aspects of network management such as identifying type of the transferred data, detection of malware applications, applying policies to restrict network accesses and so on. Basic methods in this field were using some obvious traffic features like port number and protocol type to classify the traffic type. However, recent changes in applicat...
متن کاملOffline/realtime traffic classification using semi-supervised learning
Identifying and categorizing network traffic by application type is challenging because of the continued evolution of applications, especially of those with a desire to be undetectable. The diminished effectiveness of port-based identification and the overheads of deep packet inspection approaches motivate us to classify traffic by exploiting distinctive flow characteristics of applications whe...
متن کاملBehavioral Analysis of Traffic Flow for an Effective Network Traffic Identification
Fast and accurate network traffic identification is becoming essential for network management, high quality of service control and early detection of network traffic abnormalities. Techniques based on statistical features of packet flows have recently become popular for network classification due to the limitations of traditional port and payload based methods. In this paper, we propose a metho...
متن کاملSemi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorders (LBDs) risk associated with the industrial jobs. Semi-supervised classification approach uses unlabeled data together with the small number of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
متن کاملImplementation of Distance Based Semi Supervised Clustering and Probabilistic Assignment Technique for Network Traffic Classification
Network Traffic Classification is an important process in various network management activities like network planning, designing, workload characterization etc. Network traffic classification using traditional techniques such as well known port number based and payload analysis based techniques are no more effective because various applications uses port hopping and encryption technique to avoi...
متن کامل